Skip to content

Add runtime call tracing#1421

Draft
sbryngelson wants to merge 4 commits into
MFlowCode:masterfrom
sbryngelson:runtime-call-trace
Draft

Add runtime call tracing#1421
sbryngelson wants to merge 4 commits into
MFlowCode:masterfrom
sbryngelson:runtime-call-trace

Conversation

@sbryngelson
Copy link
Copy Markdown
Member

@sbryngelson sbryngelson commented May 11, 2026

Summary

Adds an opt-in runtime call tracing mode for MFC Fortran targets.

  • instruments generated Fortran sources during trace builds
  • adds a lightweight trace runtime that resolves symbols and call sites
  • traces the local middle grid point by default during run --trace
  • writes trace output to trace.log in the case directory and mirrors to stdout only for terminal runs
  • keeps build-time --trace in the build lock while making run-time --trace a separate non-persisted run option
  • supports GNU, NVHPC, and IntelLLVM CPU trace builds

Validation

  • ./mfc.sh precheck -j 12
  • parser sanity check: build --trace sets build config trace=True; run --trace sets runtime_trace=True without changing build config trace
  • GNU/NVHPC CPU trace runs previously verified trace.log generation and no duplicated trace lines in redirected stdout
  • IntelLLVM 2024.2.1 CPU/no-MPI trace build with ifx/icx: pre_process and simulation
  • IntelLLVM short Sod run: exit 0, trace.log generated with 1928 trace lines, redirected stdout had 0 trace lines

Notes

This is draft because the runtime tracing implementation is intentionally broad and may need review around compiler support, trace volume, and the generated-source instrumentation heuristics.

@github-actions
Copy link
Copy Markdown

Claude Code Review

Head SHA: cfca415

Files changed:

  • 13
  • CMakeLists.txt
  • src/common/m_trace.f90
  • src/common/m_trace_runtime.c
  • toolchain/instrument_fortran_trace.py
  • toolchain/mfc/build.py
  • toolchain/mfc/cli/argparse_gen.py
  • toolchain/mfc/cli/commands.py
  • toolchain/mfc/cli/completion_gen.py
  • toolchain/mfc/cli/docs_gen.py
  • toolchain/mfc/cli/schema.py

Findings

1. Data race on trace_initialized in m_trace_runtime.c

trace_initialized is a plain global int (non-atomic, non-TLS):

static int trace_initialized = 0;        // NOT __thread

__cyg_profile_func_enter is called on every Fortran function entry from every thread. With hybrid MPI+OpenMP runs (which MFC supports), multiple threads can simultaneously read trace_initialized == 0, pass the guard, and race through mfc_trace_initialize() concurrently:

static void mfc_trace_initialize(void) {
    if (trace_initialized) return;       // unsynchronised read
    trace_env = getenv("MFC_TRACE");     // getenv not thread-safe under concurrent setenv
    ...
    trace_initialized = 1;               // unsynchronised write
}

getenv is not required to be thread-safe in the presence of concurrent setenv/putenv (POSIX leaves it implementation-defined). The overlapping writes to the other global trace_* fields (trace_enabled, trace_point_enabled, etc.) are also unprotected. The simplest fix is to use pthread_once or a C11 atomic_flag for trace_initialized, or a constructor-attributed function that runs once at startup.


2. Stack push in __cyg_profile_func_enter precedes trace_in_hook guard

The stack push happens before trace_in_hook is set:

void __cyg_profile_func_enter(void *this_fn, void *call_site) {
    if (trace_in_hook) return;

    if (trace_stack_depth < MFC_TRACE_STACK_MAX) {
        trace_stack_ids[trace_stack_depth] = trace_next_stack_id++;
        trace_stack_call_sites[trace_stack_depth] = call_site;
        trace_stack[trace_stack_depth++] = this_fn;   // stack modified here
    }

    trace_in_hook = 1;                                // guard set after
    mfc_trace_initialize();
    ...
}

mfc_trace_symbolmfc_trace_symbol_from_addr2line issues popen("addr2line ..."). popen internally calls fork. On Linux, the child inherits TLS with trace_in_hook = 0, so any instrumented function called in the child between fork and exec re-enters the hook without the guard firing, pushing an additional frame onto the child's TLS stack. The child exits via exec immediately, making this harmless today, but the window between stack-push and trace_in_hook = 1 is a latent re-entrancy hazard if the symbol-resolution path ever changes. Moving the stack push to after trace_in_hook = 1 closes the window.


3. CMakeLists.txtMFC_TRACE_CALLS=ON silently no-ops for Intel ifx and Cray ftn

Instrumentation flags are only added inside the GNU and NVHPC/PGI compiler blocks; there is no else branch emitting a warning or error when an unsupported compiler is used:

if (CMAKE_Fortran_COMPILER_ID STREQUAL "GNU")
    if (MFC_TRACE_CALLS)
        add_compile_options($<$<COMPILE_LANGUAGE:Fortran>:-finstrument-functions> ...)
    endif()
elseif ((CMAKE_Fortran_COMPILER_ID STREQUAL "NVHPC") OR ...)
    if (MFC_TRACE_CALLS)
        add_compile_options($<$<COMPILE_LANGUAGE:Fortran>:-Minstrument=functions> ...)
    endif()
# no else: Intel ifx and Cray ftn silently get nothing

A developer building with ifx or Cray ftn and -DMFC_TRACE_CALLS=ON gets a binary without instrumentation, an empty trace.log, and no indication of why. A message(FATAL_ERROR ...) or at minimum message(WARNING ...) guard directly under if (MFC_TRACE_CALLS) outside the compiler-id blocks would make the limitation explicit at configure time.

@sbryngelson sbryngelson force-pushed the runtime-call-trace branch 4 times, most recently from 7f6ced3 to 3cf7572 Compare May 11, 2026 01:06
@sbryngelson sbryngelson force-pushed the runtime-call-trace branch from 3cf7572 to 004d8d2 Compare May 11, 2026 01:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

1 participant